Improvements to the equal-parameter BIC for speaker diarization
Abstract
This paper discusses a set of modifications to the use of the Bayesian Information Criterion (BIC) for the speaker diarization task. We focus on the specific variant of the BIC that deploys models of equal or roughly equal statistical complexity under partitions with different numbers of speakers, and we examine three modifications. First, we investigate a way to deal with the permutation-invariance property of the estimators when dealing with mixture models. The second modification is derived by attaching a weakly informative prior over the space of speaker-level state sequences. Finally, building on the recently proposed segmental-BIC approach, we examine its effectiveness when mixtures of Gaussians are used to model the emission probabilities of a speaker. The experiments are carried out on meeting data from the NIST Rich Transcription evaluation campaign and show improvement over the baseline setting.
Similar resources
Improving Speaker Diarization
This paper describes the LIMSI speaker diarization system used in the RT-04F evaluation. The RT-04F system builds upon the LIMSI baseline data partitioner, which is used in the broadcast news transcription system. This partitioner provides a high cluster purity but has a tendency to split the data from a speaker into several clusters when there is a large quantity of data for the speaker. In th...
Investigation of Cross-Show Speaker Diarization
The goal of cross-show diarization is to index speech segments of speakers from a set of shows, with the particular challenge that speakers reappearing across shows have to be labeled with the same speaker identity. In this paper, we introduce three cross-show diarization systems, namely Global-BIC-Seg, Global-BIC-Cluster, and Incremental. We compared the three systems on a set of 46 English sci...
Phonetic subspace mixture model for speaker diarization
This paper presents an improved distance measure for speaker clustering in speaker diarization systems. The proposed phonetic subspace mixture (PSM) model introduces phonetic information to the BIC distance measure. Therefore, the new PSM model-based BIC distance measure can remove the effect of phonetic content on the diarization results. The typical BIC distance measure can be seen as a speci...
The TNO Speaker Diarization System for NIST RT05s Meeting Data
The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have developed a speech activity detector (SAD) as well. This is based on decoding the speech signal using two Gaussian Mixture Models trained on silence and...
Speaker diarization using normalized cross likelihood ratio
In this paper, we present the Normalized Cross Likelihood Ratio (NCLR) and the advantages of using it in a speaker diarization system. First, the NCLR is used as a dissimilarity measure between two Gaussian speaker models in the speaker change detection step, and its contribution to the performance of speaker change detection is compared with those of the BIC and Hotelling's T-Statistic measures. T...